Conversation

@yeazelm (Contributor) commented Dec 31, 2025

Issue number:

Related to: bottlerocket-os/bottlerocket#4673

Description of changes:
This builds the mps-control-daemon binary from the device plugin to enable MPS support. We have to patch its hardcoded paths for Bottlerocket, since the device plugin assumes it can write to /, which doesn't work with Bottlerocket.

This change also adds a new systemd service that starts this binary when the settings request it. Otherwise, the service runs sleep infinity so that systemd can try-restart the unit when the MPS settings change.

The change should be safe to take without bottlerocket-os/bottlerocket-kernel-kit#347 or the upcoming settings change, but the daemon will not work until the kmod is updated and the settings are properly set.
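
If the daemon does not come up, a quick sanity check of those prerequisites on the node might look like the following (a sketch; the lsmod output depends on the driver version, and the other commands appear verbatim in the testing below):

# lsmod | grep -i nvidia
# apiclient get settings.kubelet-device-plugins.nvidia
# systemctl status nvidia-mps-control-daemon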

Testing done:
Built images with the kernel change and the settings changes, and validated that a node comes up with MPS working when it is set in user data, that the services restart correctly, and that MPS can be enabled at runtime as well.

Setting MPS in user data for a g6.2xlarge, which has only one GPU

eksctl config snippet for setting it at the beginning:

    bottlerocket:
      settings:
        kubelet-device-plugins:
          nvidia:
            device-sharing-strategy: "mps"
            mps:
              replicas: 2
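
For context, a minimal sketch of where that snippet sits in a full eksctl ClusterConfig; the cluster name, region, and nodegroup fields are placeholders and not the exact configuration used for this testing:

apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: mps-test        # placeholder
  region: us-west-2     # placeholder
nodeGroups:
  - name: gpu           # placeholder
    instanceType: g6.2xlarge
    desiredCapacity: 1
    amiFamily: Bottlerocket
    bottlerocket:
      settings:
        kubelet-device-plugins:
          nvidia:
            device-sharing-strategy: "mps"
            mps:
              replicas: 2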

Results in a node reporting nvidia.com/gpu.shared:

Capacity:
  cpu:                    8
  ephemeral-storage:      81854Mi
  hugepages-1Gi:          0
  hugepages-2Mi:          0
  memory:                 31619656Ki
  nvidia.com/gpu.shared:  2
  pods:                   58
Allocatable:
  cpu:                    7910m
  ephemeral-storage:      76173383962
  hugepages-1Gi:          0
  hugepages-2Mi:          0
  memory:                 30602824Ki
  nvidia.com/gpu.shared:  2
  pods:                   58
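
Not exercised above, but for illustration, a workload would consume one of those shared replicas by requesting the renamed resource; a minimal sketch (image and names are placeholders):

apiVersion: v1
kind: Pod
metadata:
  name: mps-smoke-test              # placeholder
spec:
  restartPolicy: Never
  containers:
    - name: cuda
      image: nvidia/cuda:12.4.1-base-ubuntu22.04   # placeholder image
      command: ["nvidia-smi"]
      resources:
        limits:
          nvidia.com/gpu.shared: 1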

Setting MPS after boot

Start with a node with no configuration for MPS:

# apiclient get settings.kubelet-device-plugins.nvidia
{
  "settings": {
    "kubelet-device-plugins": {
      "nvidia": {
        "device-id-strategy": "index",
        "device-list-strategy": "cdi-cri",
        "device-partitioning-strategy": "none",
        "device-sharing-strategy": "none",
        "pass-device-specs": true
      }
    }
  }
}

# systemctl status
● ip-192-168-12-91.us-west-2.compute.internal
    State: running
    Units: 458 loaded (incl. loaded aliases)
     Jobs: 0 queued
   Failed: 0 units
    Since: Wed 2025-12-31 22:32:18 UTC; 5min ago
  systemd: 257.9
  Tainted: unmerged-bin
   CGroup: /
....

# systemctl status nvidia-mps-control-daemon
● nvidia-mps-control-daemon.service - NVIDIA MPS Control Daemon
     Loaded: loaded (/x86_64-bottlerocket-linux-gnu/sys-root/usr/lib/systemd/system/nvidia-mps-control-daemon.service; enabled; preset: enabled)
    Drop-In: /x86_64-bottlerocket-linux-gnu/sys-root/usr/lib/systemd/system/service.d
             └─00-aws-config.conf
             /etc/systemd/system/nvidia-mps-control-daemon.service.d
             └─exec-start.conf
     Active: active (running) since Wed 2025-12-31 22:32:32 UTC; 5min ago
 Invocation: d1565c1130dc4d9e87108f540f1178da
   Main PID: 3111 (/usr/bin/sleep)
      Tasks: 1 (limit: 36988)
     Memory: 308K (peak: 1.2M)
        CPU: 5ms
     CGroup: /system.slice/nvidia-mps-control-daemon.service
             └─3111 /usr/bin/sleep infinity

Dec 31 22:32:32 ip-... systemd[1]: Started NVIDIA MPS Control Daemon.

# systemctl cat nvidia-mps-control-daemon
# /x86_64-bottlerocket-linux-gnu/sys-root/usr/lib/systemd/system/nvidia-mps-control-daemon.service
[Unit]
Description=NVIDIA MPS Control Daemon
After=nvidia-k8s-device-plugin.service
Requires=nvidia-k8s-device-plugin.service

[Service]
Type=simple
ExecStart=/bin/true
Restart=on-failure
RestartSec=5

[Install]
WantedBy=multi-user.target

# /x86_64-bottlerocket-linux-gnu/sys-root/usr/lib/systemd/system/service.d/00-aws-config.conf
[Service]
# Set the AWS_SDK_LOAD_CONFIG system-wide instead of at the individual service
# level, to make sure new system services that use the AWS SDK for Go read the
# shared AWS config
Environment=AWS_SDK_LOAD_CONFIG=true

# /etc/systemd/system/nvidia-mps-control-daemon.service.d/exec-start.conf
[Service]
ExecStart=
ExecStart=/usr/bin/sleep infinity

The node shows one GPU:

Capacity:
  cpu:                8
  ephemeral-storage:  81854Mi
  hugepages-1Gi:      0
  hugepages-2Mi:      0
  memory:             31619660Ki
  nvidia.com/gpu:     1
  pods:               58
Allocatable:
  cpu:                7910m
  ephemeral-storage:  76173383962
  hugepages-1Gi:      0
  hugepages-2Mi:      0
  memory:             30602828Ki
  nvidia.com/gpu:     1
  pods:               58

Then set MPS:

apiclient set settings.kubelet-device-plugins.nvidia.device-sharing-strategy=mps settings.kubelet-device-plugins.nvidia.mps.replicas=8


bash-5.1# apiclient get settings.kubelet-device-plugins.nvidia
{
  "settings": {
    "kubelet-device-plugins": {
      "nvidia": {
        "device-id-strategy": "index",
        "device-list-strategy": "cdi-cri",
        "device-partitioning-strategy": "none",
        "device-sharing-strategy": "mps",
        "mps": {
          "replicas": 8
        },
        "pass-device-specs": true
      }
    }
  }
}

Now check the rest of the system:

# systemctl status
● ip-192-168-12-91.us-west-2.compute.internal
    State: running
    Units: 458 loaded (incl. loaded aliases)
     Jobs: 0 queued
   Failed: 0 units
    Since: Wed 2025-12-31 22:32:18 UTC; 7min ago
  systemd: 257.9
  Tainted: unmerged-bin
   CGroup: /
           ├─default
...
# systemctl status nvidia-mps-control-daemon
● nvidia-mps-control-daemon.service - NVIDIA MPS Control Daemon
     Loaded: loaded (/x86_64-bottlerocket-linux-gnu/sys-root/usr/lib/systemd/system/nvidia-mps-control-daemon.service; enabled; preset: enabled)
    Drop-In: /x86_64-bottlerocket-linux-gnu/sys-root/usr/lib/systemd/system/service.d
             └─00-aws-config.conf
             /etc/systemd/system/nvidia-mps-control-daemon.service.d
             └─exec-start.conf
     Active: active (running) since Wed 2025-12-31 22:39:41 UTC; 36s ago
 Invocation: 7191c5bc120246709e113d50d3ce3c54
   Main PID: 6994 (mps-control-dae)
      Tasks: 12 (limit: 36988)
     Memory: 49.1M (peak: 62M)
        CPU: 227ms
     CGroup: /system.slice/nvidia-mps-control-daemon.service
             ├─6994 /usr/bin/mps-control-daemon --config-file /etc/nvidia-k8s-device-plugin/settings.yaml
             ├─7015 nvidia-cuda-mps-control -d
             └─7021 tail -n +1 -f /run/mps/nvidia.com/gpu.shared/log/control.log

Dec 31 22:39:41 ip-192-168-12-91.us-west-2.compute.internal mps-control-daemon[7021]: [2025-12-31 22:39:41.892 Control  7015] Accepting connection...
Dec 31 22:39:41 ip-192-168-12-91.us-west-2.compute.internal mps-control-daemon[7021]: [2025-12-31 22:39:41.892 Control  7015] NEW UI
Dec 31 22:39:41 ip-192-168-12-91.us-west-2.compute.internal mps-control-daemon[7021]: [2025-12-31 22:39:41.892 Control  7015] Cmd:set_default_active_thread_percentage 12
Dec 31 22:39:41 ip-192-168-12-91.us-west-2.compute.internal mps-control-daemon[7021]: [2025-12-31 22:39:41.892 Control  7015] 12.0
Dec 31 22:39:41 ip-192-168-12-91.us-west-2.compute.internal mps-control-daemon[7021]: [2025-12-31 22:39:41.892 Control  7015] UI closed
Dec 31 22:40:11 ip-192-168-12-91.us-west-2.compute.internal mps-control-daemon[7021]: [2025-12-31 22:40:11.832 Control  7015] Accepting connection...
Dec 31 22:40:11 ip-192-168-12-91.us-west-2.compute.internal mps-control-daemon[7021]: [2025-12-31 22:40:11.832 Control  7015] NEW UI
Dec 31 22:40:11 ip-192-168-12-91.us-west-2.compute.internal mps-control-daemon[7021]: [2025-12-31 22:40:11.832 Control  7015] Cmd:get_default_active_thread_percentage
Dec 31 22:40:11 ip-192-168-12-91.us-west-2.compute.internal mps-control-daemon[7021]: [2025-12-31 22:40:11.832 Control  7015] 12.0
Dec 31 22:40:11 ip-192-168-12-91.us-west-2.compute.internal mps-control-daemon[7021]: [2025-12-31 22:40:11.832 Control  7015] UI closed
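
The Cmd: lines above are the queries the device plugin sends to the control daemon; the same query can be issued by hand through nvidia-cuda-mps-control, roughly like this (a sketch; the pipe directory is a guess based on the mpsRoot and log paths shown here and may differ on the node):

# export CUDA_MPS_PIPE_DIRECTORY=/run/nvidia/mps/nvidia.com/gpu.shared/pipe
# echo get_default_active_thread_percentage | nvidia-cuda-mps-control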

# systemctl cat nvidia-mps-control-daemon
# /x86_64-bottlerocket-linux-gnu/sys-root/usr/lib/systemd/system/nvidia-mps-control-daemon.service
[Unit]
Description=NVIDIA MPS Control Daemon
After=nvidia-k8s-device-plugin.service
Requires=nvidia-k8s-device-plugin.service

[Service]
Type=simple
ExecStart=/bin/true
Restart=on-failure
RestartSec=5

[Install]
WantedBy=multi-user.target

# /x86_64-bottlerocket-linux-gnu/sys-root/usr/lib/systemd/system/service.d/00-aws-config.conf
[Service]
# Set the AWS_SDK_LOAD_CONFIG system-wide instead of at the individual service
# level, to make sure new system services that use the AWS SDK for Go read the
# shared AWS config
Environment=AWS_SDK_LOAD_CONFIG=true

# /etc/systemd/system/nvidia-mps-control-daemon.service.d/exec-start.conf
[Service]
ExecStart=
ExecStart=/usr/bin/mps-control-daemon --config-file /etc/nvidia-k8s-device-plugin/settings.yaml

# cat /etc/nvidia-k8s-device-plugin/settings.yaml
version: v1
flags:
  migStrategy: "none"
  failOnInitError: true
  nvidiaDriverRoot: "/"
  mpsRoot: "/run/nvidia/mps"
  plugin:
    passDeviceSpecs: true
    deviceListStrategy: cdi-cri
    deviceIDStrategy: index
    containerDriverRoot: "/"
sharing:
  mps:
    renameByDefault: true
    resources:
    - name: "nvidia.com/gpu"
      replicas: 8

And the node still lists the nvidia.com/gpu resource, but with zero allocatable, alongside the new shared one:

Capacity:
  cpu:                    8
  ephemeral-storage:      81854Mi
  hugepages-1Gi:          0
  hugepages-2Mi:          0
  memory:                 31619660Ki
  nvidia.com/gpu:         1
  nvidia.com/gpu.shared:  8
  pods:                   58
Allocatable:
  cpu:                    7910m
  ephemeral-storage:      76173383962
  hugepages-1Gi:          0
  hugepages-2Mi:          0
  memory:                 30602828Ki
  nvidia.com/gpu:         0
  nvidia.com/gpu.shared:  8
  pods:                   58

This is a known edge case and is similar to how time-slicing works. To avoid the leftover resource, you'd need to start with the user-data approach.

Shifting to rename-by-default=false (apiclient set settings.kubelet-device-plugins.nvidia.mps.rename-by-default=false) keeps the original nvidia.com/gpu resource name instead:

Capacity:
  cpu:                    8
  ephemeral-storage:      81854Mi
  hugepages-1Gi:          0
  hugepages-2Mi:          0
  memory:                 31619660Ki
  nvidia.com/gpu:         8
  nvidia.com/gpu.shared:  8
  pods:                   58
Allocatable:
  cpu:                    7910m
  ephemeral-storage:      76173383962
  hugepages-1Gi:          0
  hugepages-2Mi:          0
  memory:                 30602828Ki
  nvidia.com/gpu:         8
  nvidia.com/gpu.shared:  0
  pods:                   58
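
Based on the settings.yaml shown earlier, the sharing section rendered for this case should differ only in renameByDefault (a sketch of the expected rendering, not a capture from the node):

sharing:
  mps:
    renameByDefault: false
    resources:
    - name: "nvidia.com/gpu"
      replicas: 8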

And finally, setting sharing to none disables MPS:

# apiclient set settings.kubelet-device-plugins.nvidia.device-sharing-strategy=none
# systemctl status nvidia-mps-control-daemon
● nvidia-mps-control-daemon.service - NVIDIA MPS Control Daemon
     Loaded: loaded (/x86_64-bottlerocket-linux-gnu/sys-root/usr/lib/systemd/system/nvidia-mps-control-daemon.service; enabled; preset: enabled)
    Drop-In: /x86_64-bottlerocket-linux-gnu/sys-root/usr/lib/systemd/system/service.d
             └─00-aws-config.conf
             /etc/systemd/system/nvidia-mps-control-daemon.service.d
             └─exec-start.conf
     Active: active (running) since Wed 2025-12-31 22:44:41 UTC; 2s ago
 Invocation: 82664a64fd044762a81ecef6d1cc0462
   Main PID: 9436 (/usr/bin/sleep)
      Tasks: 1 (limit: 36988)
     Memory: 308K (peak: 1.2M)
        CPU: 4ms
     CGroup: /system.slice/nvidia-mps-control-daemon.service
             └─9436 /usr/bin/sleep infinity

Dec 31 22:44:41 ip-192-168-12-91.us-west-2.compute.internal systemd[1]: Started NVIDIA MPS Control Daemon.

And the resource goes back down to 1.

With the incompatibility checks in the template, you can see the messages preventing MIG and MPS from running at the same time:

Jan 15 16:01:19 ip-192-168-23-52.us-west-2.compute.internal systemd[1]: Starting NVIDIA MPS Control Daemon...
Jan 15 16:01:19 ip-192-168-23-52.us-west-2.compute.internal echo[11584]: MPS and MIG are not supported at the same time
Jan 15 16:01:19 ip-192-168-23-52.us-west-2.compute.internal systemd[1]: Finished NVIDIA MPS Control Daemon.
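
Given that journal output, the ExecStart drop-in rendered for the MIG-plus-MPS case presumably ends up as an echo of the warning, something like this (a sketch of the expected rendering, not a capture from the node):

# /etc/systemd/system/nvidia-mps-control-daemon.service.d/exec-start.conf
[Service]
ExecStart=
ExecStart=/usr/bin/echo "MPS and MIG are not supported at the same time"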

Terms of contribution:

By submitting this pull request, I agree that this contribution is dual-licensed under the terms of both the Apache License, version 2.0, and the MIT license.

@yeazelm (Contributor, Author) commented Jan 15, 2026

^ Updated the code to use the Type changes (thanks @KCSesh!) and responded to a few other comments.

There is also a new change that performs the MIG and MPS incompatibility check during template rendering. It echoes a warning that the two don't work together. This can easily be removed if NVIDIA drops this incompatibility in a future release of their device plugin.

Jan 15 16:01:19 ip-192-168-23-52.us-west-2.compute.internal systemd[1]: Starting NVIDIA MPS Control Daemon...
Jan 15 16:01:19 ip-192-168-23-52.us-west-2.compute.internal echo[11584]: MPS and MIG are not supported at the same time
Jan 15 16:01:19 ip-192-168-23-52.us-west-2.compute.internal systemd[1]: Finished NVIDIA MPS Control Daemon.

Add support for NVIDIA Multi-Process Service (MPS) control daemon,
including service configuration and device plugin updates.

Signed-off-by: Matthew Yeazel <[email protected]>
@yeazelm (Contributor, Author) commented Jan 15, 2026

^ Updated to address comments around RemainAfterExit and default noop settings.

@yeazelm merged commit 4731f9f into bottlerocket-os:develop on Jan 16, 2026. 2 checks passed.